- The MAILING DATE of this communication appears on the cover sheet with the correspondence address
Period for Reply 

A SHORTENED STATUTORY PERIOD FOR REPLY IS SET TO EXPIRE 3 MONTH(S) FROM 
THE MAILING DATE OF THIS COMMUNICATION. 

- Extensions of time may be available under the provisions of 37 CFR 1.136(a). In no event, however, may a reply be timely filed
after SIX (6) MONTHS from the mailing date of this communication.

- If the period for reply specified above is less than thirty (30) days, a reply within the statutory minimum of thirty (30) days will be considered timely. 

- If NO period for reply is specified above, the maximum statutory period will apply and will expire SIX (6) MONTHS from the mailing date of this communication.

- Failure to reply within the set or extended period for reply will, by statute, cause the application to become ABANDONED (35 U.S.C. § 133).

- Any reply received by the Office later than three months after the mailing date of this communication, even if timely filed, may reduce any
earned patent term adjustment. See 37 CFR 1.704(b).

Status 

Responsive to communication(s) filed on . 



2a)n This action is FINAL. 2b)^ This action is non-final. 

3) n Since this application is in condition for allowance except for formal matters, prosecution as to the merits is
closed in accordance with the practice under Ex parte Quayle, 1935 C.D. 11, 453 O.G. 213.
Disposition of Claims 

4) ^ Claim(s) 1-12 and 14-18 is/are pending in the application. 

4a) Of the above claim(s) is/are withdrawn from consideration. 

5) 0 Claim(s) is/are allowed. 

6) n Claim(s) 1-12, 14-18 is/are rejected.

7) □ Claim(s) is/are objected to.

8) 0 Claim(s) are subject to restriction and/or election requirement. 

Application Papers 

9) 0 The specification is objected to by the Examiner. 

10) 0 The drawing(s) filed on is/are: a)n accepted or b)□ objected to by the Examiner.

Applicant may not request that any objection to the drawing(s) be held in abeyance. See 37 CFR 1.85(a). 

11) 0 The proposed drawing correction filed on is: a)n approved b)n disapproved by the Examiner.

If approved, corrected drawings are required in reply to this Office action. 

12) n The oath or declaration is objected to by the Examiner. 
Priority under 35 U.S.C. §§119 and 120 

13) n Acknowledgment is made of a claim for foreign priority under 35 U.S.C. § 119(a)-(d) or (f).

a)n All b)n Some * c)n None of:

1. □ Certified copies of the priority documents have been received.

2. □ Certified copies of the priority documents have been received in Application No. .

3. □ Copies of the certified copies of the priority documents have been received in this National Stage
application from the International Bureau (PCT Rule 17.2(a)).
* See the attached detailed Office action for a list of the certified copies not received.

14) n Acknowledgment is made of a claim for domestic priority under 35 U.S.C. § 119(e) (to a provisional application).

a) □ The translation of the foreign language provisional application has been received.

15) n Acknowledgment is made of a claim for domestic priority under 35 U.S.C. §§ 120 and/or 121.

Attachment(s) 

1) S Notice of References Cited (PTO-892) 4) □ Interview Summary (PTO-413) Paper No(s).

2) □ Notice of Draftsperson's Patent Drawing Review (PTO-948) 5) □ Notice of Informal Patent Application (PTO-152)

3) D Information Disclosure Statement(s) (PTO-1449) Paper No(s) . 6) □ Other

DETAILED ACTION

1. The request filed on 04/14/03 for a Continued Prosecution Application (CPA) under 37
CFR 1.53(d) based on parent Application No. 09464076 is acceptable and a CPA has been
established. An action on the CPA follows.

Response to Amendment 

2. The claims of the CPA are based on the last amendment after the final rejection, dated
03/17/2003. In the last amendment, applicant amended claims 1 and 9-12, which are moot in
view of the new ground(s) of rejection.

Response to Arguments 

3. Applicant's arguments filed on 03/17/2003 have been fully considered.

On page 10, line 3, applicant cites that "Claim 1, 6, 9-12 and 14 have been amended". 
However, claim 6 on page 4 and claim on page 7 appear to be "original"; the examiner will only 
consider claims 1 and 9-12 as the amended claims, hereinafter. 

In response to the applicant's argument that "the office action on 01/15/2003 could not be
made final" because of "newly cited art" (amendment, page 10, section I), the examiner respectfully
disagrees with the applicant. The reference(s) cited in the final action are merely further
evidence of obviousness that was recited as official notice in the previous office action and
challenged by the applicant, so that the final rejection is proper.

Regarding the objections (amendment, page 11), applicant insistently refuses to change
the phrase "text to speech" to the suggested "text-to-speech", even though the examiner explained
in the last office action that the reason is to avoid potential ambiguity and to increase searching
efficiency. It is noted that doing so will benefit both the Office and all patent applicants, so
the applicant's cooperation is requested.

4. Regarding "REJECTION UNDER 35 U.S.C. § 102":

In response to the applicant's argument (regarding claim 6, see amendment: page 12,
paragraphs 1-3) that "when dividing the text into words, Sharman does not use a dictionary. As a
result, this portion of Sharman does not anticipate adding a textual unit to a list when the textual
unit "corresponds to a stored textual unit in a vocabulary unit" as recited in Claim 6", the
examiner respectfully disagrees with applicant and has a different view of the prior art teachings.
In fact, as stated in the office action, Sharman discloses text tokenisation preprocessing 310 (Fig.
3) to split input text into tokens (words), and word conversion 315 to implement special rules to map
lexical items into canonical word form (column 5, lines 3-17), which is the same or an equivalent
process as stated in the applicant's specification (see page 7, paragraph 4). Now, applicant
appears to suggest that a dictionary should be used when dividing the text into words, which
contradicts the specification and is not specifically reflected in the claim. Further, it is
inherent in the nature of the processing that a dictionary can be used only after dividing the text
into at least one word, which is disclosed by Sharman (column 5, lines 18-40) as stated in the office
action. In addition, Sharman discloses the text unit (including word) information used in different
processing stages (column 6, lines 61-67, and table 1), and the output buffer used when a
component produces several output units for each input unit that it receives (column 7, lines 61-67),
which is inherently capable of storing the textual list as claimed.

At this point, the examiner believes that the applicant's arguments are not persuasive. 

Specification 

5. The disclosure is objected to because of the following informalities: 

Changing the phrase "text to speech engine (or system)" (for example on page 1, line 11 and
line 20) to "text-to-speech engine (or system)" in the application would be necessary for the
purpose of avoiding potential ambiguity. Appropriate correction is required.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the
basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless -

(b) the invention was patented or described in a printed publication in this or a foreign country or in public use or on
sale in this country, more than one year prior to the date of application for patent in the United States.

6. Claim 6 is rejected under 35 U.S.C. 102(b) as being anticipated by Sharman.

Regarding claim 6, Sharman discloses a text to speech system. Sharman further
discloses a linguistic processor for various linguistic processes comprising: text tokenisation
preprocessing 310 (Fig. 3) to split input text into tokens (words); word conversion 315 to
implement special rules to map lexical items into canonical word form, such as converting numbers
to word strings and expanding acronyms and abbreviations; and syllabication 320 to look up and match
the words using a dictionary, to remove any possible prefix or suffix for a word, and to break
a word down into constituent syllables, the syllabified word (equivalent to a list of textual units), for
further processing (column 5, lines 3-40). In addition, Sharman discloses the text unit (including
word) information used in different processing stages (column 6, lines 61-67, and table 1), and the
output buffer used when a component produces several output units for each input unit that
it receives (column 7, lines 61-67). This corresponds to the claimed "a method of pre-processing a
text file comprising: receiving a text file; parsing said text file into textual units, where each said
parsed textual unit is one of a word, a prefix or a suffix; and for each one of said parsed textual
units, if said one of said parsed textual units corresponds to a stored textual unit in a vocabulary
of textual units, adding said stored textual unit to a list."
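
For illustration only, the pre-processing steps quoted above from claim 6 can be sketched as follows. This is a minimal sketch under stated assumptions: the vocabulary, the prefix and suffix tables, and the function names are hypothetical and are taken from neither the application nor Sharman.

```python
# Hypothetical vocabulary and affix tables; illustrative values only.
VOCABULARY = {"un", "break", "able", "the", "dog"}
PREFIXES = ("un", "re")
SUFFIXES = ("able", "ing")


def parse_textual_units(text):
    """Split text into words, then peel off at most one known prefix and suffix."""
    units = []
    for word in text.lower().split():
        word = word.strip(".,;:!?\"'")
        if not word:
            continue
        if word in VOCABULARY:
            # Whole word is stored, so no affix splitting is needed.
            units.append(word)
            continue
        prefix = next(
            (p for p in PREFIXES if word.startswith(p) and len(word) > len(p)), None
        )
        stem = word[len(prefix):] if prefix else word
        suffix = next(
            (s for s in SUFFIXES if stem.endswith(s) and len(stem) > len(s)), None
        )
        if suffix:
            stem = stem[:-len(suffix)]
        units.extend(u for u in (prefix, stem, suffix) if u)
    return units


def preprocess(text):
    """Return the parsed units that correspond to a stored vocabulary unit."""
    return [unit for unit in parse_textual_units(text) if unit in VOCABULARY]
```

In this sketch a unit that has no counterpart in the vocabulary is simply left off the list; how such units are handled is the subject of the later claims.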

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are
such that the subject matter as a whole would have been obvious at the time the invention was made to a person
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the
manner in which the invention was made.

7. Claims 1-5, 9-15 and 18 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Sharman (USPN 5,774,854) in view of Hata et al. (USPN 5,878,393) hereinafter referenced as
Hata.

Regarding claim 1, Sharman discloses a text to speech system comprising:

a. a processing branch concerned with individual words, including component 320
(Fig. 3) using a dictionary look-up and/or removing any possible prefix or suffix (column
5, lines 18-27),

b. components 325, 330 and 335 performing phonetic transcription in which the
syllabified word is broken down still further into its constituent phonemes, again using a
dictionary look-up table (column 5, lines 30-33), and an acoustic processor 220 (Figs. 2 and
4) preparing acoustic data by using diphone library 420 (Fig. 4) (column 6, lines 25-26),

c. the text unit (including word) information used in different processing stages
(column 6, lines 61-67, and table 1), and

d. output buffer 590 (Fig. 5) storing the result of processing and checking output data
sufficiency, wherein the output buffer is used when a component produces several output
units for each input unit that it receives (column 7, lines 54-67),

which corresponds to the claimed "receiving a list of textual units, where each said textual unit is
one of a word, a prefix or a suffix; for each unit comprising a word, locating an associated
speech sample in a memory; appending said associated speech sample to an output signal." But
Sharman fails to explicitly disclose utilizing a "speech sample" for the phonetic data in item b.
above, though he states that a diphone library 420 (Fig. 4) effectively contains prerecorded
segments of diphones (column 6, line 25). However, the examiner contends that the concept of
providing a speech sample as phonetic data was well known, as taught by Hata.

In the same field of endeavor, Hata discloses a high quality concatenative reading system.
Hata further discloses that the system has a dictionary of sampled sounds 40 (Fig. 1) (column 3,
lines 42-45) and that the individual speech samples represent discrete units of speech, such as phonemes or
words (column 3, lines 26-31). Furthermore, Hata discloses multiple buffers for storing text and
speech data in different processing stages, including input buffer 44 (Fig. 1), word list buffer 48,
and sample list buffer 54 (column 5, lines 6-26), which are inherently capable of storing the
textual list as claimed.

Therefore, it would have been obvious to one of ordinary skill in the art at the time the
invention was made to combine Sharman and Hata, to specifically provide stored speech samples
for generating sound data, as taught by Hata, for the purpose of increasing the sound quality of the
system.
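
As a purely illustrative aid, the method language quoted from claim 1 (receive a list of textual units, locate an associated speech sample in memory, append it to an output signal) might look like the sketch below. The sample memory, its contents and the function name are hypothetical; neither Sharman nor Hata is represented as implementing it this way.

```python
# Hypothetical sample memory: textual unit -> audio samples (illustrative values).
SAMPLE_MEMORY = {
    "hello": [0.1, 0.2, 0.1],
    "world": [0.3, -0.1],
}


def synthesize(units, memory=SAMPLE_MEMORY):
    """Append the stored sample for each unit found in memory to the output signal."""
    output_signal = []
    for unit in units:
        sample = memory.get(unit)  # "locating an associated speech sample"
        if sample is not None:
            output_signal.extend(sample)  # "appending ... to an output signal"
    return output_signal
```

Units with no stored sample are skipped in this sketch; the dependent claims address that case separately.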

Regarding claim 2, Sharman and Hata disclose everything claimed, as applied above (see
claim 1). Sharman further suggests that: (i) at substring level, it is useful to include some back-up
mechanism to be able to process words that are not in the dictionary (column 5, line 24); (ii)
at phoneme level, it is again using a dictionary look-up table, augmented with general purpose
rules for words not in the dictionary (column 5, line 34); which is equivalent to using a "secondary
text-to-speech engine". Furthermore, Sharman discloses that the phoneme data and other
portions of data are sent to the acoustic processor to produce output data stored in the output buffer
590 (Fig. 5) (column 8, lines 23-24). This corresponds to the claimed "wherein one said textual
unit in said list is indicated as not having an associated speech sample in memory and said
method further comprises: passing said indicated textual unit to a secondary text to speech
engine; receiving a speech sample converted from said indicated textual unit from said secondary
text to speech engine; and appending said converted speech sample to said output signal."
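
The fallback path recited in claim 2 can be pictured, again only as a hedged sketch, as a lookup that hands any unit without a stored sample to a second engine. The `secondary_engine` stand-in below is hypothetical and merely returns placeholder values; it does not model any engine described in the cited references.

```python
def secondary_engine(unit):
    """Stand-in for a secondary engine: one placeholder value per character."""
    return [0.0] * len(unit)


def synthesize_with_fallback(units, memory, fallback=secondary_engine):
    """Use the stored sample when present, otherwise the fallback engine's output."""
    output_signal = []
    for unit in units:
        sample = memory.get(unit)
        if sample is None:
            # Unit has no associated sample in memory: pass it to the
            # secondary engine and use the converted sample it returns.
            sample = fallback(unit)
        output_signal.extend(sample)
    return output_signal
```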

Regarding claim 3, Sharman and Hata disclose everything claimed, as applied above (see
claim 2). But Sharman fails to explicitly disclose that "said secondary text-to-speech engine
comprises a phonetic text-to-speech engine based on a voice talent". However, the examiner
contends that the concept of utilizing a phonetic text-to-speech engine based on stored and
processed speech samples (herein equivalent to a voice talent) was well known, as taught by Hata.

Hata further discloses that the system has a dictionary of sampled sounds 40 (Fig. 1)
(column 3, lines 42-45) and that the individual speech samples each represent discrete units of speech,
such as phonemes or words (column 3, lines 26-31).

Therefore, it would have been obvious to one of ordinary skill in the art at the time the
invention was made to modify Sharman by specifically providing a phonetic text-to-speech
engine based on stored and processed speech samples for a TTS engine, as taught by Hata, for the
purpose of increasing sound quality for the system.

Regarding claim 4, Sharman and Hata disclose everything claimed, as applied above (see
claim 1). Sharman also discloses that processing input text at the substring level is based on a
syllabified word (column 5, line 31), which inherently satisfies all elements of the claimed
"wherein a consecutive plurality of said textual units in said list represent a whole word, said
method further comprising: for each textual unit in said consecutive plurality of said textual
units, locating an associated speech sample in said memory; creating a speech unit by splicing
together said plurality of associated speech samples; and appending said speech unit to said
output signal."

Regarding claim 5, Sharman and Hata disclose everything claimed, as applied above (see
claim 4). Sharman further discloses components of identifying diphones 410 (Fig. 4), diphone
library 420 and diphone concatenation 415 for overcoming audible discontinuities (column 6,
lines 34-40), which corresponds to the claimed "after said splicing, processing said speech unit to
remove discontinuities."
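
To make the splicing and smoothing language of claims 4 and 5 concrete, the sketch below joins the samples of consecutive units and blends a few samples at each seam. The linear crossfade is an assumption chosen for illustration; the record does not say which smoothing technique Sharman's diphone concatenation or the application uses, and all names are hypothetical.

```python
def crossfade(left, right, overlap):
    """Join two sample lists, linearly blending `overlap` samples at the seam."""
    overlap = min(overlap, len(left), len(right))
    if overlap == 0:
        return left + right
    blended = []
    for i in range(overlap):
        # Weight ramps from the left sample toward the right sample.
        weight = (i + 1) / (overlap + 1)
        blended.append(
            (1 - weight) * left[len(left) - overlap + i] + weight * right[i]
        )
    return left[:len(left) - overlap] + blended + right[overlap:]


def splice_units(units, memory, overlap=2):
    """Build one speech unit from the samples of consecutive textual units."""
    speech_unit = []
    for unit in units:
        speech_unit = crossfade(speech_unit, memory[unit], overlap)
    return speech_unit
```

Blending shortens the result by `overlap` samples per seam, which is one of several possible design choices for removing an audible step between adjacent samples.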

Regarding claim 9, it discloses an apparatus, which corresponds to the method of
claim 1; the apparatus is obvious in that it simply provides structure for the functionality
found in claim 1.

Regarding claim 10, it discloses an apparatus, which corresponds to the method of
claim 1; the apparatus is obvious in that it simply provides structure for the functionality
found in claim 1. In addition, Sharman specifically discloses that the TTS system includes
two microprocessors (column 3, line 17), which corresponds to the claimed "a text to speech
converter comprising a processor operable to . . ."

Regarding claim 11, it discloses an apparatus, which corresponds to the method of claim
1; the apparatus is obvious in that it simply provides structure for the functionality found in claim
1. In addition, Sharman specifically discloses that an arrangement is particularly suitable for a
workstation (equivalent to computer) equipped with an adapter card with its own DSP
(equivalent to processor) (column 3, line 21), which corresponds to the claimed "a computer
readable medium for providing program control to a processor, said processor included in a text
to speech converter, said computer readable medium adapting said processor to be operable to . . ."

Regarding claim 12, it discloses an apparatus, which corresponds to a combination of the 
method of claim 1 and the method of claim 6; the apparatus is obvious in that it simply provides 
structure for the functionality found in claim 1 and claim 6. 

Regarding claim 13, it is canceled.

Regarding claim 14, it discloses a data structure, which is used in and corresponds to the
method of claim 1; the data structure is obvious in that it simply provides a part of software
structure for the functionality found in claim 1.

Regarding claim 15, Sharman and Hata disclose everything claimed, as applied above 
(see claim 14). But, Sharman and Hata fail to explicitly disclose a data structure "further 
comprising a field for a phoneme that said textual unit starts with, and a field for a phoneme that 
the textual unit ends with" as claimed. However, the examiner contends that the concept of 
utilizing fields for a beginning phoneme and an ending phoneme in a data structure was well 
known, as taught by Hata. 

In the same field of endeavor, Hata discloses a high quality concatenative reading system.
Hata further discloses a phonological feature table (an array type of data structure) 52 (Fig.
3), comprising fields of phonemes that a word may begin and end with (column 5, lines 14-31,
and column 7, lines 55-59).

Therefore, it would have been obvious to one of ordinary skill in the art at the time the
invention was made to modify Sharman and Hata by specifically providing fields for a beginning
phoneme and an ending phoneme for the existing data structure, as taught by Hata, for the purpose
of obtaining better sound quality.
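
One way to picture the data structure of claim 15, offered only as a hedged sketch, is a record with a field for the phoneme the textual unit starts with and a field for the phoneme it ends with. The field names, phoneme symbols and helper function are hypothetical and do not reproduce Hata's phonological feature table 52.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TextualUnitEntry:
    """Hypothetical record for one stored textual unit."""
    text: str
    sample: List[float]
    first_phoneme: str  # phoneme the textual unit starts with
    last_phoneme: str   # phoneme the textual unit ends with


def boundary_pair(left, right):
    """Return the phoneme pair that meets at the seam between two adjacent units."""
    return (left.last_phoneme, right.first_phoneme)
```

Keeping the boundary phonemes alongside each sample lets a concatenation step inspect the seam between neighbours without re-analysing the audio.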

Regarding claim 18, it depends on claim 12; and it discloses an apparatus, which
corresponds to a combination of the method of claim 7 and the method of claim 16; the apparatus
is obvious in that it simply provides structure for the functionality found in claim 7 and claim 16.

8. Claim 7 is rejected under 35 U.S.C. 103(a) as being unpatentable over Sharman in view
of Microsoft Press ("Computer Dictionary", page 298) hereinafter referenced as Rl.

Regarding claim 7, Sharman discloses everything claimed, as applied above (see claim
6). Sharman particularly discloses that apart from using a dictionary look-up, "it is useful to
include some back-up mechanism to be able to process words that are not in the dictionary"
(column 5, lines 24-26), which corresponds to the claimed "if said one of said parsed textual
units does not correspond to one of said stored textual units" and "as being out of vocabulary."
Sharman further states that "the output unit represents the size of the text unit (e.g. word,
sentence, phoneme); for many stages this is accompanied by additional information for that unit
(e.g., duration, part of speech etc.)" (column 6, line 59 to column 7, line 2), which suggests that
the text unit may be different in each of the processing stages. But Sharman fails to explicitly
disclose marking a text unit that is matched neither in the dictionary nor by the rule sets.
However, the examiner contends that the concept of marking text unit data was well known, as
taught by Rl.

Rl is a popular computer dictionary that gives the common meaning and explanation of
words or phrases in computer related arts. Rl further discloses that one of the common
meanings of the word "mark" is "in applications and data storage, a symbol or other device used
to distinguish one item from others like it" (page 298, entry "mark"), so that when using "mark"
as a verb, it can be interpreted as an action to mark a symbol for certain data in a data storage,
such as used for "text unit", for distinguishing the data from other data.

Therefore, it would have been obvious to one of ordinary skill in the art at the time the
invention was made to modify Sharman by specifically marking a text unit of the processed data,
as taught by Rl, for the purpose of distinguishing the text unit that is not in the dictionary and
preparing for further processing stages, such as processing in a back-up mechanism, generating
phonemes, and coping with prosodic information (Sharman, column 5, lines 25-26, column 5, lines
30-56 and column 5, lines 26).

Regarding claim 16, Sharman and Rl disclose everything claimed, as applied above (see
claim 7). Sharman further suggests that: (i) at substring level, it is useful to include some back-up
mechanism to be able to process words that are not in the dictionary (column 5, line 24); (ii)
at phoneme level, it is again using a dictionary look-up table, augmented with general purpose
rules for words not in the dictionary (column 5, line 34); which is equivalent to using a "secondary
text to speech engine". Furthermore, Sharman discloses that the buffer may be used for storing
multi-stage input and output (column 7, lines 61-67) for different text units depending on the
process stage (column 6, line 61 to column 7, line 22), which inherently includes process stage(s)
in a secondary TTS engine. This corresponds to the claimed "passing said marked textual unit to
a secondary text to speech engine, receiving a speech sample converted from said marked textual
unit from said secondary text to speech engine, and appending said converted speech sample to
said output signal."

9. Claim 8 is rejected under 35 U.S.C. 103(a) as being unpatentable over Sharman in view
of Rl and further in view of O'Donnell ("programming for the world: a guide to
internationalization", ISBN 0-13-722190-8).

Regarding claim 8, Sharman and Rl disclose everything claimed, as applied above (see
claim 7). But Sharman and Rl fail to disclose that "said marking comprises pre-pending a
character to said textual unit." However, the examiner contends that the concept of marking a
text unit by using a pre-pending character was well known, as taught by O'Donnell.

O'Donnell is the author of the book "programming for the world", which discloses appending
a character symbol "$" to a digit string to distinguish a monetary amount from an ordinary number
(page 49, table 2.11).

Therefore, it would have been obvious to one of ordinary skill in the art at the time the
invention was made to modify Sharman and Rl by specifically marking a text unit of the
processed data by adding a character, such as "$" or the like, in front of the text units, as taught
by O'Donnell, for the purpose of easily distinguishing the text units and preparing for further
processing.
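
The marking step of claims 7 and 8 can be illustrated, under stated assumptions, by placing a marker character in front of each unit that is out of vocabulary. The choice of "$" follows the example character mentioned above; the function names and the vocabulary are hypothetical.

```python
OOV_MARK = "$"  # hypothetical marker character, following the "$" example above


def mark_out_of_vocabulary(units, vocabulary, mark=OOV_MARK):
    """Pre-pend `mark` to every unit that has no stored counterpart."""
    return [unit if unit in vocabulary else mark + unit for unit in units]


def is_marked(unit, mark=OOV_MARK):
    """Report whether a unit carries the out-of-vocabulary marker."""
    return unit.startswith(mark)
```

A later stage could then route only the units for which `is_marked` returns true to a secondary engine, as in the language of claims 16 and 17.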

Regarding claim 17, Sharman, Rl and O'Donnell disclose everything claimed, as applied
above (see claim 8). Sharman further suggests that: (i) at substring level, it is useful to include
some back-up mechanism to be able to process words that are not in the dictionary (column 5,
line 24); (ii) at phoneme level, it is again using a dictionary look-up table, augmented with
general purpose rules for words not in the dictionary (column 5, line 34); which is equivalent to
using a "secondary text to speech engine". Furthermore, Sharman discloses that the buffer may be
used for storing multi-stage input and output (column 7, lines 61-67) for different text units
depending on the process stage (column 6, line 61 to column 7, line 22), which inherently
includes process stage(s) in a secondary TTS engine. This corresponds to the claimed "passing
said marked textual unit to a secondary text to speech engine; receiving a speech sample
converted from said marked textual unit from said secondary text to speech engine; and
appending said converted speech sample to said output signal."

Conclusion



10. Any response to this office action should be mailed to: 

Commissioner of Patents and Trademarks, Washington, D.C. 20231

or faxed to: 

(703)-872-9314 

Hand-delivered responses should be brought to: 

Crystal Park II, 2121 Crystal Drive, Arlington, VA, Sixth Floor (Receptionist).

Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Qi Han whose telephone number is (703) 305-5631. The
examiner can normally be reached on Monday through Thursday from 8:00 a.m. to 5:30 p.m. and
Friday from 8:00 a.m. to 12:00 a.m.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Marsha Banks-Harold, can be reached on (703) 305-4379. 

Any inquiry of a general nature or relating to the status of this application or proceeding
should be directed to the Technology Center 2600 Customer Service Office whose telephone
number is (703) 306-0377.

QH/qh MARSHA D. BANKS-HAROLD 

May 16, 2003 SUPERVISORY PATENT EXAMINER 

TECHNOLOGY CENTER 2600 



